Update cuML benchmark utility (add LinearSVC and improve Random Forest fairness) #5165
Closed
beckernick wants to merge 10,000 commits into rapidsai:branch-23.04 from beckernick:update-benchmarks
RAPIDS recently bumped `xgboost` from `1.5.2` to `1.6.0` in rapidsai/integration#487; this PR adapts to that change. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4777
This PR updates raft outdated pinnings in dev yml files. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Thejaswi. N. S (https://github.com/teju85) - Ray Douglass (https://github.com/raydouglass) - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4778
Changes to be in line with: rapidsai/cudf#11058 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4771
Authors: - Jiaming Yuan (https://github.com/trivialfis) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4782
…#4770) Resolves rapidsai#4442 This PR fixes the issue with using mixed data types in regression errors like `mean_squared_error`, `mean_absolute_error` and `mean_squared_log_error`. Authors: - Shaswat Anand (https://github.com/shaswat-indian) Approvers: - William Hicks (https://github.com/wphicks) URL: rapidsai#4770
…th a ColumnTransformer step (rapidsai#4774) This PR fixes a subtle bug in check_array of cuml.thirdparty_adapters.adapters which is the primary cause for the bug. Fix rapidsai#4368. Authors: - https://github.com/VamsiTallam95 - Ray Douglass (https://github.com/raydouglass) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4774
Authors: - Divye Gala (https://github.com/divyegala) - Ray Douglass (https://github.com/raydouglass) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4789
Pin max version of `cuda-python` to `11.7.0` Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#4793
Pin max version of `cuda-python` to `11.7.0` This is a back port of rapidsai#4793. Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers:
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context

I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions:

> [rapidsai#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also removes dependence on `thrust/detail` headers.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- William Hicks (https://github.com/wphicks)

URL: rapidsai#4675
Closes rapidsai#4210. Adds a cosine distance metric for computing the epsilon neighborhood in DBSCAN. The cosine distance is computed via the L2 distance between L2-normalized vectors, and the epsilon value is adjusted accordingly. Authors: - Tarang Jain (https://github.com/tarang-jain) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4776
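The relationship this commit message relies on can be checked numerically. For unit-length vectors, the squared Euclidean distance equals twice the cosine distance, so a cosine threshold `eps` corresponds to an L2 threshold of `sqrt(2 * eps)`. The sketch below is a plain-Python illustration of that identity only; the helper names are hypothetical, and the exact adjustment cuML applies internally is not shown in this PR.

```python
import math


def l2_normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Plain Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_eps_to_l2_eps(eps):
    """For unit vectors ||a - b||^2 = 2 * (1 - cos), so eps_l2 = sqrt(2 * eps)."""
    return math.sqrt(2.0 * eps)


a, b = [1.0, 2.0, 3.0], [2.0, 1.0, 0.5]
d_cos = cosine_distance(a, b)
d_l2 = l2_distance(l2_normalize(a), l2_normalize(b))

# Both neighborhood tests give the same answer for any threshold.
eps = 0.3
same_decision = (d_cos <= eps) == (d_l2 <= cosine_eps_to_l2_eps(eps))
```

Because the two tests agree, a neighborhood search that only supports L2 distance can serve cosine queries after normalizing the inputs once and rescaling epsilon.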
Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Ray Douglass (https://github.com/raydouglass) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4809
Authors: - Micka (https://github.com/lowener) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4805
This PR resolves rapidsai#802 by adding python API for `v_measure_score`. Also came across an [issue](rapidsai#4784) while working on this. Authors: - Shaswat Anand (https://github.com/shaswat-indian) Approvers: - Micka (https://github.com/lowener) - William Hicks (https://github.com/wphicks) URL: rapidsai#4785
Fixes issue rapidsai#2387.

For large data sizes, the batch size of the DBSCAN algorithm is small in order to fit the distance matrix in memory. This results in a matrix that has dimensions num_points x batch_size, both for the distance and adjacency matrix. The conversion of the boolean adjacency matrix to CSR format is performed in the 'adjgraph' step. This step was slow when the batch size was small, as described in issue rapidsai#2387.

In this commit, the adjgraph step is sped up. This is done in two ways:

1. The adjacency matrix is now stored in row-major batch_size x num_points format (it was transposed before). This required changes in the vertexdeg step.
2. The csr_row_op kernel has been replaced by the adj_to_csr kernel. This kernel can divide the work over multiple blocks even when the number of rows (batch size) is small. It makes optimal use of memory bandwidth because rows of the matrix are laid out contiguously in memory.

Authors:
- Allard Hendriksen (https://github.com/ahendriksen)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#4803
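To make the adjgraph step concrete, here is a sequential reference for what a boolean-adjacency-to-CSR conversion computes on a row-major batch_size x num_points matrix. This is only a sketch of the output format, not the CUDA kernel from the commit; the function name `adj_rows_to_csr` and the sample data are hypothetical.

```python
def adj_rows_to_csr(adj):
    """Convert a row-major boolean adjacency batch to CSR (indptr, indices).

    adj[i][j] is True when point j lies in the epsilon neighborhood of the
    i-th point of the current batch.
    """
    indptr = [0]   # indptr[i + 1] - indptr[i] is the degree of batch row i
    indices = []   # column ids of the True entries, row after row
    for row in adj:
        for col, is_neighbor in enumerate(row):
            if is_neighbor:
                indices.append(col)
        indptr.append(len(indices))
    return indptr, indices


# A batch of 2 rows against 5 points.
adj = [
    [True, False, True, False, False],
    [False, True, True, False, True],
]
indptr, indices = adj_rows_to_csr(adj)
degrees = [indptr[i + 1] - indptr[i] for i in range(len(adj))]
```

With the row-major layout each row is scanned contiguously, which is the access pattern the commit message credits for the better use of memory bandwidth.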
This functionality has been moved to RAFT. Authors: - Allard Hendriksen (https://github.com/ahendriksen) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4829
…4804) This PR removes the naive versions of the DBSCAN algorithms. They were not used anymore and were largely incorrect, as described in rapidsai#3414. This fixes issue rapidsai#3414. Authors: - Allard Hendriksen (https://github.com/ahendriksen) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4804
[gpuCI] Forward-merge branch-22.08 to branch-22.10 [skip gpuci]
Pass the `NVTX` option to raft in a way more similar to the other arguments, and make sure the `RAFT_NVTX` option is present in the installed `raft-config.cmake`. Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Robert Maynard (https://github.com/robertmaynard) URL: rapidsai#4825
[gpuCI] Forward-merge branch-22.08 to branch-22.10 [skip gpuci]
The conda recipe was updated to UCX 1.13.0 in rapidsai#4809 , but updating conda environment files was missing there. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Jordan Jacobelli (https://github.com/Ethyling) URL: rapidsai#4813
Allows cuML to be installed with CuPy 11. xref: rapidsai/integration#508 Authors: - https://github.com/jakirkham Approvers: - Sevag H (https://github.com/sevagh) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4837
Resolves rapidsai#3403 This PR adds support for using `pandas.Series` as an input to `TfidfVectorizer`, `HashingVectorizer` and `CountVectorizer`. Authors: - Shaswat Anand (https://github.com/shaswat-indian) - Ray Douglass (https://github.com/raydouglass) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4811
Reverts rapidsai#4837 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#4847
[gpuCI] Forward-merge branch-22.08 to branch-22.10 [skip gpuci]
PR does the required changes for Scikit-build using RAPIDS-CMake.

- [x] Update .gitignore
- [x] Create `python/cuml/CMakeLists.txt` file
- [x] Add `CMakeLists.txt` using RAPIDS-CMake to Python folders
- [x] Update `setup.py`
- [x] Update `build.sh`
- [x] Update CI files
- [x] Update conda env files
- [x] Clean code

Authors:
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Divye Gala (https://github.com/divyegala)
- Corey J. Nolet (https://github.com/cjnolet)
- Sevag H (https://github.com/sevagh)
- Vyas Ramasubramani (https://github.com/vyasr)
- Robert Maynard (https://github.com/robertmaynard)

URL: rapidsai#4818
This PR updates the branch reference used for our shared workflows. I will open similar PRs for `branch-23.04` next week. Authors: - AJ Schmidt (https://github.com/ajschmidt8) Approvers: - Ray Douglass (https://github.com/raydouglass)
Forward-merge branch-23.02 to branch-23.04
Authors: - Victor Lafargue (https://github.com/viclafargue) - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Carl Simon Adorf (https://github.com/csadorf) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5130
Forward-merge branch-23.02 to branch-23.04
With Python 3.10, there appears to be an issue with the interaction between the staticmethod decorator and Cython. This workaround temporarily switches all staticmethods in FIL to classmethods until the underlying issue can be sorted. Resolve rapidsai#5200. Authors: - William Hicks (https://github.com/wphicks) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5202
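The shape of this workaround can be shown in plain Python. The sketch below uses a hypothetical `Model` class, not the FIL code, and it does not reproduce the Cython issue itself; it only illustrates that a static factory can be rewritten as a classmethod without changing how callers invoke it.

```python
class Model:
    """Hypothetical stand-in for a class that exposes factory helpers."""

    def __init__(self, path):
        self.path = path

    # Before: a static factory that has to name the class explicitly.
    @staticmethod
    def load_static(path):
        return Model(path)

    # After: the same factory as a classmethod. The call site is unchanged,
    # but the function now receives the class as its first argument.
    @classmethod
    def load(cls, path):
        return cls(path)


m_static = Model.load_static("model.bin")
m_class = Model.load("model.bin")
```

Since both forms are called as `Model.<name>(path)`, a switch like this is invisible to users, which is presumably why it works as a temporary measure.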
Forward-merge branch-23.02 to branch-23.04
Closes rapidsai#5008 Authors: - Micka (https://github.com/lowener) - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5148
Forward-merge branch-23.02 to branch-23.04
This PR pins `dask` and `distributed` to `2023.1.1` for `23.02` release. xref: rapidsai/cudf#12695 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Mark Sadang (https://github.com/msadang) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5198
Forward-merge branch-23.02 to branch-23.04
This PR moves the date string from the version to the build string for conda recipes in this repository. This is necessary to ensure that the conda packages resulting from PR builds can be installed in the same environment as nightly conda packages, which is useful for testing purposes. Additionally, this prevents a bug from occurring where the Python builds fail because the date string it computes is different than the one computed by the C++ build, therefore causing the Python build to search for a C++ build artifact that doesn't exist. xref: rapidsai/rmm#1195 Authors: - AJ Schmidt (https://github.com/ajschmidt8) Approvers: - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#5190
closes issue rapidsai#5206 Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Corey J. Nolet (https://github.com/cjnolet)
Authors: - William Hicks (https://github.com/wphicks) Approvers: - Victor Lafargue (https://github.com/viclafargue) - Dante Gama Dessavre (https://github.com/dantegd)
Forward-merge branch-23.02 to branch-23.04
Removed the slow modulo operator with a minor change in index arithmetic. This gave the following performance improvement for a test case:

| | branch-23.02 | kernel-shap-improvments | Gain |
|-------------------------|------------------|-------------------------|------|
| sampled_rows_kernel | 663 | 193 | 3.4x |
| exact_rows_kernel | 363 | 236 | 1.5x |

All times in microseconds.

Code used for benchmarking:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor as rf
from cuml.explainer import KernelExplainer
import numpy as np

data, labels = make_classification(n_samples=1000, n_features=20, n_informative=20, random_state=42, n_redundant=0, n_repeated=0)

X_train, X_test, y_train, y_test = train_test_split(data, labels, train_size=998, random_state=42)  # sklearn train_test_split

y_train = np.ravel(y_train)
y_test = np.ravel(y_test)

model = rf(random_state=42).fit(X_train, y_train)

cu_explainer = KernelExplainer(model=model.predict, data=X_train, is_gpu_model=False, random_state=42, nsamples=100)
cu_shap_values = cu_explainer.shap_values(X_test)
print('cu_shap:', cu_shap_values)
```

Authors:
- Vinay Deshpande (https://github.com/vinaydes)
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#5187
cjnolet approved these changes Feb 8, 2023
Codecov Report
Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04    #5165   +/-   ##
===============================================
  Coverage                ?   67.17%
===============================================
  Files                   ?      192
  Lines                   ?    12426
  Branches                ?        0
===============================================
  Hits                    ?     8347
  Misses                  ?     4079
  Partials                ?        0
ajschmidt8 force-pushed the branch-23.04 branch from 6f2fda7 to 20d2690 (February 13, 2023 18:57)
Superseded by #5242
rapids-bot bot pushed a commit that referenced this pull request Mar 6, 2023
…5242)

This PR makes several small changes:

- Adds `LinearSVC` and `LinearSVR` to the cuML benchmarks. Currently, we run SVC/SVR(linear) to benchmark a linear SVM. The scikit-learn documentation recommends using LinearSVC for large datasets instead for performance reasons. For even 10,000 records, the performance difference is quite significant. As the model quality can differ slightly between SVC(linear) and LinearSVC, we add LinearSVC rather than replace SVC(linear).

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

X, y = make_classification(n_samples=30000, n_features=10)

clf = LinearSVC()
%time clf.fit(X,y)
print(clf.score(X,y))

clf = SVC(kernel="linear")
%time clf.fit(X,y)
print(clf.score(X,y))

CPU times: user 529 ms, sys: 4.09 ms, total: 534 ms
Wall time: 534 ms
0.9278
CPU times: user 5.23 s, sys: 115 ms, total: 5.35 s
Wall time: 5.35 s
0.9278
```

- Adds HDBSCAN to the benchmarks
- Updates `RandomForest{Classifier, Regressor}` to use all CPU cores on the machine and to train more than 10 trees. The scikit-learn implementation benefits significantly from using multiple cores, but the benefit is capped by the number of trees. On large machines, using only 10 trees will bias toward slower performance relative to what's possible. As it's rare for people to train Random Forests with only 10 trees, this is changed to a more reasonable (but small) number of 50 trees.

```python
clf = RandomForestClassifier(n_estimators=2, n_jobs=1)
%time clf.fit(X,y)

clf = RandomForestClassifier(n_estimators=2, n_jobs=-1)
%time clf.fit(X,y)

clf = RandomForestClassifier(n_estimators=6, n_jobs=-1) # three times as many trees, same wall time
%time clf.fit(X,y)

CPU times: user 3.09 s, sys: 20.9 ms, total: 3.11 s
Wall time: 3.1 s
CPU times: user 3.14 s, sys: 8.51 ms, total: 3.14 s
Wall time: 1.76 s
CPU times: user 8.74 s, sys: 19.3 ms, total: 8.76 s
Wall time: 1.68 s
```

- Updates RandomForestClassifier to use `max_features="sqrt"` rather than 1.0. This is generally regarded as the appropriate default setting (used in scikit-learn and noted in Hastie's ESL). Using 1.0 as max features takes significantly longer to train on the CPU and results in more correlated trees, which is not expected to improve results. As a result, it's not the ideal "default" characterization of performance.
- Refactors the HDBSCAN import utilities into a single `has_hdbscan` utility now that we use more of the CPU library in different areas.

This replaces #5165

Authors:
- Nick Becker (https://github.com/beckernick)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #5242
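The claim that the multi-core benefit is capped by the number of trees can be illustrated with a deliberately idealized model: each core builds one tree at a time, every tree costs the same, and there is no scheduling overhead. The sketch below is an assumption-laden back-of-the-envelope calculation, not a measurement; the 64-core machine and 1.5 s per tree are hypothetical numbers.

```python
import math


def ideal_wall_time(n_trees, n_cores, seconds_per_tree):
    """Idealized wall time: trees are built in rounds of at most n_cores."""
    rounds = math.ceil(n_trees / n_cores)
    return rounds * seconds_per_tree


def core_utilization(n_trees, n_cores):
    """Fraction of available core-time doing useful work in the same model."""
    rounds = math.ceil(n_trees / n_cores)
    return n_trees / (rounds * n_cores)


N_CORES = 64            # hypothetical machine size
SECONDS_PER_TREE = 1.5  # hypothetical per-tree cost

for n_trees in (10, 50, 100):
    wall = ideal_wall_time(n_trees, N_CORES, SECONDS_PER_TREE)
    util = core_utilization(n_trees, N_CORES)
    print(f"{n_trees:>3} trees: {wall:.1f} s wall, {util:.0%} of cores busy")
```

Under this model, 10 trees and 50 trees finish in the same single round on a 64-core machine, so a 10-tree benchmark leaves most of the CPU idle and understates what the CPU implementation can do per tree.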
Labels
benchmarking
Cython / Python
Cython or Python issue
improvement
Improvement / enhancement to an existing function
non-breaking
Non-breaking change
This PR makes several small changes:

- Adds `LinearSVC` to the cuML benchmarks. Currently, we run SVC(linear) to benchmark a linear SVM. The scikit-learn documentation recommends using LinearSVC for large datasets instead for performance reasons. For even 10,000 records, the performance difference is quite significant. As the model quality can differ slightly between SVC(linear) and LinearSVC, we add LinearSVC rather than replace SVC(linear).
- Updates `RandomForest{Classifier, Regressor}` to use all CPU cores on the machine and to train more than 10 trees. The scikit-learn implementation benefits significantly from using multiple cores, but the benefit is capped by the number of trees. On large machines, using only 10 trees will bias toward slower performance relative to what's possible. As it's rare for people to train Random Forests with only 10 trees, this is changed to a more reasonable (but small) number of 50 trees.
- Updates RandomForestClassifier to use `max_features="sqrt"` rather than 1.0. This is generally regarded as the appropriate default setting (used in scikit-learn and noted in Hastie's ESL). Using 1.0 as max features takes significantly longer to train on the CPU and results in more correlated trees, which is not expected to improve results. As a result, it's not the ideal "default" characterization of performance.
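To see why `max_features` matters for training cost, it helps to count how many candidate features each split examines. The helper below is a hypothetical sketch that assumes the usual convention (`"sqrt"` resolves to the floor of the square root of the feature count, a float to that fraction of the features, each at least 1); it is not the benchmark code from this PR.

```python
import math


def features_per_split(max_features, n_features):
    """Resolve a max_features setting to a per-split candidate count.

    Assumption: "sqrt" -> floor(sqrt(n_features)); a float fraction ->
    floor(fraction * n_features); both clamped to at least 1.
    """
    if max_features == "sqrt":
        return max(1, int(math.sqrt(n_features)))
    if isinstance(max_features, float):
        return max(1, int(max_features * n_features))
    raise ValueError(f"unsupported max_features: {max_features!r}")


for n_features in (10, 100, 1000):
    sqrt_count = features_per_split("sqrt", n_features)
    full_count = features_per_split(1.0, n_features)
    print(f"{n_features:>4} features: sqrt -> {sqrt_count}, 1.0 -> {full_count}")
```

The gap widens with dimensionality: every split considers all features under 1.0, while `"sqrt"` examines only a small random subset, which also decorrelates the trees.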